Abstract This paper deals with a linear model of regression on quantiles 
when the explanatory variable takes values in some functional space and the 
response is scalar. We propose a spline estimator of the functional coefficient 
that minimizes a penalized $L^1$-type criterion. Then, we study the asymptotic 
behavior of this estimator. The penalization is of primary importance to get 
existence and convergence. 

Key words Functional data analysis, conditional quantiles, B-spline 
functions, roughness penalty. 

1 Introduction 

Because of the increasing performance of measurement apparatus and 
computers, many data are collected and saved on finer and finer time scales 
or spatial grids (temperature curves, spectrometric curves, satellite images, 
...). So, we are led to process data comparable to curves or, more 
generally, to functions of continuous variables (time, space). These data are 
called functional data in the literature (see Ramsay and Silverman, 2002). 
Thus, there is a need to develop statistical procedures as well as theory for 
this kind of data, and indeed many recent works study models taking into 
account the functional nature of the data. 

Mainly in a formal way, the oldest works in that direction intended to 
give a mathematical framework based on the theory of linear operators in 
Hilbert spaces (see Deville, 1974, Dauxois and Pousse, 1976). After that, 
and in another direction, practical aspects of extensions of descriptive 
statistical methods, such as Principal Component Analysis, have been 
considered (see Besse and Ramsay, 1986). The monographs by Ramsay and 
Silverman (1997, 2002) are important contributions in this area. 

As pointed out by Ramsay and Silverman (1997), "the goals of functional 
data analysis are essentially the same as those of other branches of 
Statistics": one of these goals is to explain the variations of a dependent 
variable $Y$ (response) by using information from an independent functional 
variable $X$ (explanatory variable). In many applications, the response is a 
scalar: see Frank and Friedman (1993), Ramsay and Silverman (1997), ... 
Traditionally, one deals, for such a problem, with estimating the regression 
on the mean, i.e. the minimizer among some class of functionals $r$ of 

\[ E[(Y - r(X))^2]. \]

As when X is a vector of real numbers, the two main approaches are linear 
(see Ramsay and Dalzell, 1991, for the functional linear model) or purely 
nonparametric (see Ferraty and Vieu, 2002, which adapt kernel estimation 
to the functional setting). It is also known that estimating the regression on 
the median or, more generally, on quantiles has some interest. The problem 
is then to estimate the minimizer among functionals $g_\alpha$ of 

\[ E[l_\alpha(Y - g_\alpha(X))], \qquad (1) \]

where $l_\alpha(u) = |u| + (2\alpha - 1)u$. The value $\alpha = 1/2$ 
corresponds to the conditional median whereas values $\alpha \in ]0, 1[$ 
correspond to conditional quantiles of order $\alpha$. The advantage of 
estimating conditional quantiles may be found in many applications such as in 
agronomy (estimation of yield thresholds), in medicine or in reliability. 
Besides the robustness of the median, it may also help to derive some kind of 
prediction intervals based on quantiles. 
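
To illustrate why minimizing (1) targets a quantile, here is a minimal sketch, 
not taken from the original study: the simulated sample, the seed and the 
helper name `check_loss` are assumptions made for illustration. It evaluates 
the empirical version of the criterion for a constant predictor $c$ and checks 
that its minimizer coincides with the empirical quantile of order $\alpha$. 

```python
import numpy as np

def check_loss(u, alpha):
    """l_alpha(u) = |u| + (2*alpha - 1)*u, the loss appearing in criterion (1)."""
    u = np.asarray(u, dtype=float)
    return np.abs(u) + (2.0 * alpha - 1.0) * u

# Hypothetical sample, only used to illustrate the property.
rng = np.random.default_rng(0)
y = rng.normal(size=1001)
alpha = 0.9

# The empirical criterion c -> mean(l_alpha(y - c)) is convex and piecewise
# linear, so a minimizer is attained at one of the observations.
risk = np.array([check_loss(y - c, alpha).mean() for c in y])
c_star = y[np.argmin(risk)]

print(c_star, np.quantile(y, alpha))
```

With $\alpha = 1/2$ the same computation returns the sample median, in line 
with the remark above on the conditional median. 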

In our work, we assume that the conditional quantile of order $\alpha$ can be 
written as 

\[ g_\alpha(X) = \langle \Psi_\alpha, X \rangle, \qquad (2) \]

where $\langle \cdot, \cdot \rangle$ is a functional inner product and the 
parameter of the model $\Psi_\alpha$ is a function to be estimated. This is 
the equivalent of the linear model for regression quantiles studied by Koenker 
and Bassett (1978), where the inner product is the Euclidean one and the 
parameter is a vector of scalars. We choose to estimate the function 
$\Psi_\alpha$ by a "direct" method: writing our estimator as a linear 
combination of B-splines, it minimizes the empirical version of expectation 
(1) with the addition of a penalty term proportional to the squared norm of a 
derivative of given order of the spline. The penalization term allows on the 
one hand to control the regularity of the estimator and on the other hand to 
get consistency. 

Unlike for the square function, minimization of the function $l_\alpha$ does 
not lead to an explicit expression of the estimator. While the estimator can 
be computed with traditional algorithms (for instance based on Iteratively 
Reweighted Least Squares), the convexity of $l_\alpha$ allows theoretical 
developments. 

In section 2, we define more precisely the framework of our study and the 
spline estimator of the functional parameter $\Psi_\alpha$. Section 3 is 
devoted to the asymptotic behaviour of our estimator: we study $L^2$ 
convergence and derive an upper bound for the rate of convergence. Comments 
on the model and on the optimality of the rate of convergence are given in 
section 4. Finally, the proofs are gathered in section 5. 






2 Construction of the estimator 

In this work, the data consist of an i.i.d. sample of pairs 
$(X_i, Y_i)_{i=1,\ldots,n}$ drawn from a population distribution $(X, Y)$. We 
consider explanatory variables $X_i$ which are square integrable (random) 
functions defined on $[0, 1]$, i.e. are elements of the space $L^2([0, 1])$, 
so that $X_i = (X_i(t), t \in [0, 1])$. The response $Y_i$ is a scalar 
belonging to $\mathbb{R}$. Assume that $H$, the range of $X$, is a closed 
subspace of $L^2([0, 1])$. For $Y$ having a finite expectation, 
$E(|Y|) < +\infty$, and for $\alpha \in ]0, 1[$, the conditional 
$\alpha$-quantile functional $g_\alpha$ of $Y$ given $X$ is a functional 
defined on $H$ minimizing (1). 

Our aim is to generalize the linear model introduced by Koenker and 
Bassett (1978). In our setting, it consists in assuming that $g_\alpha$ is a 
linear and continuous functional defined on $H$, and then it follows that 
$g_\alpha(X)$ can be written as in (2). Taking the usual inner product in 
$L^2([0, 1])$, we can write 

\[ g_\alpha(X) = \langle \Psi_\alpha, X \rangle = \int_0^1 \Psi_\alpha(t) X(t)\, dt, \]

where $\Psi_\alpha$ is the functional coefficient in $H$ to be estimated, the 
order $\alpha$ being fixed. From now on we consider, for simplicity, that the 
random variables $X_i$ are centered, that is to say $E(X_i(t)) = 0$ for 
almost every $t$. 

When $X$ is multivariate, Bassett and Koenker (1978) study the least 
absolute error (LAE) estimator for the conditional median, which can be 
extended to any quantile by replacing the absolute value by the convex 
function $l_\alpha$ in the criterion to be minimized (see Koenker and Bassett, 
1978). In our case, where we have to estimate a function belonging to an 
infinite dimensional space, we look for an estimator in the form of an 
expansion in some basis of B-spline functions, minimizing a similar criterion 
with, however, the addition of a penalty term. 

Before describing in detail the estimation procedure, let us note that 
estimation of conditional quantiles has received special attention in the 
multivariate case. As said before, linear modelling has been mainly 
investigated by Bassett and Koenker (1978). For nonparametric models, we may 
distinguish two different approaches: "indirect" estimators, which are based 
on a preliminary estimation of the conditional cumulative distribution 
function (cdf), and "direct" estimators, which are based on minimizing the 
empirical version of criterion (1). In the class of "indirect" estimators, 
Bhattacharya and Gangopadhyay (1990) study a kernel estimator of the 
conditional cdf, and estimation of the quantile is achieved by inverting this 
estimated cdf. In the class of "direct" estimators, kernel estimators based on 
local fit have been proposed (see Tsybakov, 1986, Lejeune and Sarda, 1988 or 
Fan, Hu and Truong, 1994); in a similar approach, He and Shi (1994) and 
Koenker et al. (1994) propose a spline estimator. Although our setting is 
quite different, we adapt in our proofs below some arguments of the work by 
He and Shi (1994). 

In nonparametric estimation, it is usual to assume that the function to 
be estimated is sufficiently smooth so that it can be expanded in some basis: 
the degree of smoothness is quantified by the number of derivatives and a 
Lipschitz condition for the derivative of greatest order (see condition (H.2) 
below). It is also quite usual to approximate such functions by means of 
regression splines (see de Boor, 1978, for a guide to splines). For this, we 
have to select a degree $q$ in $\mathbb{N}$ and a subdivision of $[0, 1]$ 
defining the position of the knots. Although it is not necessary, we take 
equispaced knots so that only the number of knots has to be selected: for $k$ 
in $\mathbb{N}^*$, we consider $k - 1$ knots that define a subdivision of the 
interval $[0, 1]$ into $k$ sub-intervals. For asymptotic theory, the degree 
$q$ is fixed but the number of sub-intervals $k$ depends on the sample size 
$n$, $k = k_n$. It is well-known that a spline function is a piecewise 
polynomial: we consider here piecewise polynomials of degree $q$ on each 
sub-interval, and $(q - 1)$ times differentiable on $[0, 1]$. This space of 
spline functions is a vector space of dimension $k + q$. A basis of this 
vector space is the set of the so-called normalized B-spline functions, which 
we denote by $\mathbf{B}_{k,q} = (B_1, \ldots, B_{k+q})^\top$. 
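
The construction above can be sketched numerically. The following code is an 
illustration only (the function name `bspline_basis`, the evaluation grid and 
the values of $k$ and $q$ are assumptions, and the Cox-de Boor recursion is one 
standard way to evaluate B-splines, not necessarily the one used by the 
authors). It builds the normalized B-splines of degree $q$ with $k - 1$ 
equispaced interior knots and confirms that there are $k + q$ of them. 

```python
import numpy as np

def bspline_basis(t_eval, k, q):
    """Evaluate the k + q normalized B-splines of degree q on [0, 1],
    with k - 1 equispaced interior knots (k sub-intervals)."""
    t_eval = np.asarray(t_eval, dtype=float)
    breakpoints = np.linspace(0.0, 1.0, k + 1)
    # Boundary knots 0 and 1 are repeated so that each has multiplicity q + 1.
    knots = np.concatenate([np.zeros(q), breakpoints, np.ones(q)])
    n_int = len(knots) - 1
    # Degree 0: indicators of the non-empty knot intervals.
    B = np.zeros((len(t_eval), n_int))
    for j in range(n_int):
        left, right = knots[j], knots[j + 1]
        if right > left:
            inside = (t_eval >= left) & (t_eval < right)
            if right == 1.0:  # close the last interval at t = 1
                inside |= (t_eval == 1.0)
            B[:, j] = inside
    # Cox-de Boor recursion, terms with a zero denominator being dropped.
    for d in range(1, q + 1):
        B_new = np.zeros((len(t_eval), n_int - d))
        for j in range(n_int - d):
            den1 = knots[j + d] - knots[j]
            den2 = knots[j + d + 1] - knots[j + 1]
            if den1 > 0:
                B_new[:, j] += (t_eval - knots[j]) / den1 * B[:, j]
            if den2 > 0:
                B_new[:, j] += (knots[j + d + 1] - t_eval) / den2 * B[:, j + 1]
        B = B_new
    return B  # shape (len(t_eval), k + q)

grid = np.linspace(0.0, 1.0, 101)
basis = bspline_basis(grid, k=8, q=3)
print(basis.shape)
```

Each row of `basis` contains the values $B_1(t), \ldots, B_{k+q}(t)$ at one 
grid point; the functions are non-negative and sum to one on $[0, 1]$, which is 
the sense of "normalized" here. 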

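Then, we estimate $\Psi_\alpha$ by a linear combination of the functions 
$B_l$. This leads us to find a vector 
$\hat\theta = (\hat\theta_1, \ldots, \hat\theta_{k+q})^\top$ in 
$\mathbb{R}^{k+q}$ such that 

\[ \hat\Psi_\alpha = \sum_{l=1}^{k+q} \hat\theta_l B_l = \mathbf{B}_{k,q}^\top \hat\theta. \qquad (3) \]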



It is then natural to look for $\hat\Psi_\alpha$ as the minimizer of the 
empirical version of (1) among functionals $g_\alpha$ of the form (2) with 
functions $\Psi_\alpha$ belonging to the space of spline functions defined 
above. We will however consider a penalized criterion, as we will see now. In 
our setting, the pseudo-design matrix $A$ is the matrix of dimension 
$n \times (k + q)$ with elements $\langle X_i, B_j \rangle$ for 
$i = 1, \ldots, n$ and $j = 1, \ldots, k + q$. Even if we do not have an 
explicit expression for a solution to the minimization problem, it is known 
that the solution would depend on the properties of the inverse of the matrix 
$\frac{1}{n} A^\top A$, which is the $(k + q) \times (k + q)$ matrix with 
general term $\langle \Gamma_n(B_j), B_l \rangle$, where $\Gamma_n$ is the 
empirical version of the covariance operator $\Gamma_X$ of $X$ defined for all 
$u$ in $L^2([0, 1])$ by 

\[ \Gamma_X u = E(\langle X, u \rangle X). \qquad (4) \]

We know that $\Gamma_X$ is a nuclear operator (see Dauxois et al., 1982); 
consequently no bounded inverse exists for this operator. Moreover, as a 
consequence of the first monotonicity principle (see theorem 7.1, p. 58, in 
Weinberger, 1974), the restriction of this operator to the space of spline 
functions has smaller eigenvalues than $\Gamma_X$. Finally, it appears to be 
impossible to control the speed of convergence to zero of the smallest 
eigenvalue of $\frac{1}{n} A^\top A$ (when $n$ tends to infinity): in that 
sense, we are faced with an inversion problem that can be qualified as 
ill-conditioned. A way to circumvent this problem is to introduce a 
penalization term in the minimization criterion (see Ramsay and Silverman, 
1997, or Cardot et al., 2003, for a similar approach in the functional linear 
model). Thus, the main role of the penalization is to control the inversion of 
the matrix linked to the solution of the problem, and it consists in 
restricting the space of solutions. The penalization introduced below will 
have another effect, since we also want to control the smoothness of our 
estimator. For this reason, and following several authors (see references 
above), we choose a penalization which allows to control the norm of the 
derivative of order $m > 0$ of any linear combination of B-spline functions, 
so that it can be expressed in matrix form. Denoting by 
$(\mathbf{B}_{k,q}^\top \theta)^{(m)}$ the $m$-th derivative of the spline 
function $\mathbf{B}_{k,q}^\top \theta$, we have 

\[ \| (\mathbf{B}_{k,q}^\top \theta)^{(m)} \|^2 = \theta^\top G_k \theta, \quad \forall \theta \in \mathbb{R}^{k+q}, \]

where $G_k$ is the $(k + q) \times (k + q)$ matrix with general term 
$[G_k]_{jl} = \langle B_j^{(m)}, B_l^{(m)} \rangle$. 

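Then, the vector $\hat\theta$ in (3) is chosen as the solution of the 
following minimization problem 

\[ \min_{\theta \in \mathbb{R}^{k+q}} \left\{ \frac{1}{n} \sum_{i=1}^n l_\alpha\big( Y_i - \langle \mathbf{B}_{k,q}^\top \theta, X_i \rangle \big) + \rho\, \| (\mathbf{B}_{k,q}^\top \theta)^{(m)} \|^2 \right\}, \qquad (5) \]

where $\rho$ is the penalization parameter. In the next section, we present a 
convergence result for the solution of (5). Note that the role of the 
penalization also clearly appears in this result. 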
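
For computations it can help to rewrite the penalized criterion of this 
section in matrix form. The short derivation below is a restatement under the 
notation already introduced (pseudo-design matrix $A$ with entries 
$\langle X_i, B_j \rangle$, penalty matrix $G_k$, penalization parameter 
$\rho$); it adds no new assumption beyond linearity of the inner product. 

```latex
\[
\langle \mathbf{B}_{k,q}^\top \theta, X_i \rangle
  = \sum_{j=1}^{k+q} \theta_j \langle B_j, X_i \rangle
  = (A\theta)_i ,
\qquad
\| (\mathbf{B}_{k,q}^\top \theta)^{(m)} \|^2 = \theta^\top G_k \theta ,
\]
so that the minimization problem reads
\[
\min_{\theta \in \mathbb{R}^{k+q}}
\left\{ \frac{1}{n} \sum_{i=1}^n l_\alpha\big( Y_i - (A\theta)_i \big)
        + \rho\, \theta^\top G_k \theta \right\} .
\]
```

In this form the problem is a finite-dimensional convex program in $\theta$, 
whose data are only $A$, $G_k$, the responses $Y_i$ and $\rho$. 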



3 Convergence result 

We present in this section the main result on the convergence of our 
estimator. The behaviour of our estimator is linked to a penalized version of 
the matrix $C = \frac{1}{n} A^\top A$. More precisely, adopting the same 
notations as in Cardot et al. (2003), the existence and convergence of our 
estimator depend on the inverse of the matrix $C_\rho = C + \rho G_k$. Under 
the hypotheses of theorem 1 below, the smallest eigenvalue of $C_\rho$, 
denoted $\lambda_{\min}(C_\rho)$, tends to zero as the sample size $n$ tends 
to infinity. As the rate of convergence of $\hat\Psi_\alpha$ depends on the 
speed of convergence of $\lambda_{\min}(C_\rho)$ to zero, we introduce a 
sequence $(\eta_n)_{n \in \mathbb{N}}$ such that the set $\Omega_n$ defined by 

\[ \Omega_n = \{ \omega \,/\, \lambda_{\min}(C_\rho) > c\, \eta_n \} \qquad (6) \]

has probability which goes to 1 when $n$ goes to infinity. Cardot et al. 
(2003) have shown that such a sequence exists, in the sense that under the 
hypotheses of theorem 1 there exists a strictly positive sequence 
$(\eta_n)_{n \in \mathbb{N}}$ tending to zero as $n$ tends to infinity and 
such that 

\[ \lambda_{\min}(C_\rho) \ge c\, \eta_n + o_P\big( (k_n^2 n^{1-\delta})^{-1/2} \big), \qquad (7) \]

with $\delta \in ]0, 1[$. 

To prove the convergence result of the estimator, we assume that the 
following hypotheses are satisfied. 

(H.1) $\| X \| \le C_0 < +\infty$, a.s. 

(H.2) The function $\Psi_\alpha$ is supposed to have a $p'$-th derivative 
$\Psi_\alpha^{(p')}$ such that 

\[ | \Psi_\alpha^{(p')}(t) - \Psi_\alpha^{(p')}(s) | \le C_1 |t - s|^\nu, \quad s, t \in [0, 1], \]

where $C_1 > 0$ and $\nu \in [0, 1]$. In what follows, we set $p = p' + \nu$ 
and we suppose that $q \ge p \ge m$. 

(H.3) The eigenvalues of $\Gamma_X$ (defined in (4)) are strictly positive. 

(H.4) For $x \in H$, the random variable $\epsilon$ defined by 
$\epsilon = Y - \langle \Psi_\alpha, X \rangle$ has conditional density 
function $f_x$ given $X = x$, continuous and bounded below by a strictly 
positive constant at 0, uniformly for $x \in H$. 

We derive in theorem 1 below an upper bound for the rate of convergence 
with respect to some kind of $L^2$-norm. Indeed, the operator $\Gamma_X$ is 
non-negative, so we can associate with it a semi-norm denoted $\|\cdot\|_2$ 
and defined by $\| u \|_2^2 = \langle \Gamma_X u, u \rangle$. Then, we have 
the following result. 

Theorem 1 Under hypotheses (H.1)-(H.4), if we also suppose that there 
exist $\beta, \gamma$ in $]0, 1[$ such that $k_n \sim n^{\beta}$, 
$\rho \sim n^{-\gamma}$ and $\eta_n \sim n^{-\beta - (1-\delta)/2}$ (where 
$\delta$ is defined in relation (7)), then 

(i) $\hat\Psi_\alpha$ exists and is unique except on a set whose probability 
goes to zero as $n$ goes to infinity, 

(ii) $\| \hat\Psi_\alpha - \Psi_\alpha \|_2^2 = O_P\left( \dfrac{1}{k_n^{2p}} + \dfrac{1}{n \eta_n} + \dfrac{\rho^2}{k_n \eta_n} + \rho\, k_n^{2(m-p)} \right)$. 

4 Some comments 

(i) Hypotheses (H.1) and (H.3) are quite usual in the functional setting: see 
for instance Bosq (2000) or Cardot et al. (2003). Hypothesis (H.4) implies 
uniqueness of the conditional quantile of order $\alpha$. 

(ii) Some arguments in the proof of theorem 1 are inspired by the proof of 
He and Shi (1994) within the framework of real covariates. Moreover, some 
results from Cardot et al. (2003) are also useful, mainly to deal with the 
penalization term, as pointed out above. 

Note that it is assumed in the model of He and Shi (1994) that the error term 
is independent of $X$: condition (H.4) allows us to deal with a more general 
setting, as in Koenker and Bassett (1978). 

(iii) It is possible to choose particular values for $\beta$ and $\gamma$ to 
optimize the upper bound for the rate of convergence in theorem 1. In 
particular, we remark the importance of controlling the speed of convergence 
to 0 of the smallest eigenvalue of $C_\rho$ by $\eta_n$. For example, Cardot 
et al. (2003) have shown that, under the hypotheses of theorem 1, relation (7) 
is true with $\eta_n = \rho / k_n$. This gives us 

\[ \| \hat\Psi_\alpha - \Psi_\alpha \|_2^2 = O_P\left( \frac{1}{k_n^{2p}} + \frac{k_n}{n \rho} + \rho + \rho\, k_n^{2(m-p)} \right). \]

A corollary is obtained if we take $k_n \sim n^{1/(4p+1)}$ and 
$\rho \sim n^{-2p/(4p+1)}$; then we get 

\[ \| \hat\Psi_\alpha - \Psi_\alpha \|_2^2 = O_P\big( n^{-2p/(4p+1)} \big). \]

We can imagine that, with stronger hypotheses on the random function $X$, we 
can find a sequence $\eta_n$ greater than $\rho / k_n$, which would improve 
the convergence speed of the estimator. As a matter of fact, the rate derived 
in theorem 1 does not imply the rate obtained by Stone (1982), that is to say 
a rate of order $n^{-2p/(2p+1)}$. Indeed, suppose that $1/k_n^{2p}$, 
$1/(n \eta_n)$ and $\rho^2/(k_n \eta_n)$ are all of order $n^{-2p/(2p+1)}$. 
This would imply that $k_n \sim n^{1/(2p+1)}$ and 
$\eta_n \sim n^{-1/(2p+1)}$, which contradicts the condition 
$\eta_n \sim n^{-\beta - (1-\delta)/2}$. Nevertheless, it is possible to 
obtain a speed of order $n^{-2p/(2p+1) + \kappa}$. This leads to 
$k_n \sim n^{1/(2p+1) - \kappa/(2p)}$ and 
$\eta_n \sim n^{-1/(2p+1) - \kappa}$. Then, the condition 
$\eta_n \sim n^{-\beta - (1-\delta)/2}$ implies 
$\kappa = p(1 - \delta)/(2p + 1)$. So finally, we get 
$k_n \sim n^{(1+\delta)/(2(2p+1))}$, 
$\rho \sim n^{(-4p-1+\delta)/(4(2p+1))}$ and 
$\eta_n \sim n^{(-p-1+p\delta)/(2p+1)}$. The convergence result would then be 

\[ \| \hat\Psi_\alpha - \Psi_\alpha \|_2^2 = O_P\big( n^{-p(1+\delta)/(2p+1)} \big). \]

A final remark is that the last term $\rho\, k_n^{2(m-p)}$ of the speed in 
theorem 1 is not always negligible compared to the other terms. However, it 
will be the case if we suppose that 
$m < p/(1 + \delta) + (1 - \delta)/(4(1 + \delta))$. 
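
The exponent arithmetic of comment (iii) can be checked mechanically. The 
sketch below is an illustration, not part of the original paper: the function 
names and the sample values of $p$, $\delta$ and $m$ are assumptions. It uses 
exact rational arithmetic to recompute the exponents of $n$ in $k_n$, $\rho$ 
and $\eta_n$ when the first three terms of the bound in theorem 1 (ii) are 
balanced at order $n^{-2p/(2p+1)+\kappa}$. 

```python
from fractions import Fraction

def exponents(p, delta):
    """Exponents of n obtained by balancing 1/k_n^(2p), 1/(n eta_n) and
    rho^2/(k_n eta_n) under the constraint eta_n ~ n^(-beta - (1 - delta)/2)."""
    p, delta = Fraction(p), Fraction(delta)
    kappa = p * (1 - delta) / (2 * p + 1)
    rate = -2 * p / (2 * p + 1) + kappa          # common order of the three terms
    k_exp = 1 / (2 * p + 1) - kappa / (2 * p)    # from 1/k_n^(2p) ~ n^rate
    eta_exp = -1 / (2 * p + 1) - kappa           # from 1/(n eta_n) ~ n^rate
    rho_exp = (rate + k_exp + eta_exp) / 2       # from rho^2/(k_n eta_n) ~ n^rate
    return {"kappa": kappa, "rate": rate, "k": k_exp, "eta": eta_exp, "rho": rho_exp}

def last_term_exponent(m, p, delta):
    """Exponent of n in the remaining term rho * k_n^(2(m - p))."""
    e = exponents(p, delta)
    return e["rho"] + 2 * (Fraction(m) - Fraction(p)) * e["k"]

# Hypothetical values, chosen only to exercise the formulas.
p, delta = Fraction(2), Fraction(1, 2)
e = exponents(p, delta)
print(e)
print(last_term_exponent(1, p, delta) < e["rate"])
```

For these values the term $\rho\, k_n^{2(m-p)}$ is of smaller order than the 
rate when $m = 1$ but not when $m = 2$, which agrees with the threshold 
$p/(1+\delta) + (1-\delta)/(4(1+\delta))$ stated above. 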

(iv) This quantile estimator is quite useful in practice, especially for 
forecasting purposes (by conditional median or inter-quantile intervals). From 
a computational point of view, several algorithms may be used: we have 
implemented in the R language an algorithm based on Iteratively Reweighted 
Least Squares (IRLS). Note that, even though for real data the curves are 
always observed at some discretization points, the regression spline is easy 
to implement by approximating inner products with quadrature rules. The IRLS 
algorithm (see Ruppert and Carroll, 1988, Lejeune and Sarda, 1988) allows to 
build conditional quantile spline estimators and gives satisfactory forecast 
results. This algorithm has been used in particular on the "ORAMIP" 
("Observatoire Régional de l'Air en Midi-Pyrénées") data to forecast 
pollution in the city of Toulouse (France): the results of this practical 
study are described in Cardot et al. (2004). We are interested in predicting 
the ozone concentration one day ahead, knowing the ozone curve (concentration 
along time) the day before. In that special case, conditional quantiles were 
also useful to predict an ozone threshold such that the probability to exceed 
this threshold is a given risk $1 - \alpha$. In other words, it comes back to 
giving an estimation of the $\alpha$-quantile of the maximum ozone knowing the 
ozone curve the day before. 
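
The text does not describe the IRLS implementation in detail, so the following 
is only a sketch under explicit assumptions: it is written in Python rather 
than R, it uses a majorize-minimize form of IRLS in which $|u|$ is bounded by 
a quadratic around the current residual, and the names `irls_quantile`, 
`penalized_criterion`, `n_iter` and `eps` are hypothetical. The inputs are a 
generic design matrix `A` (in the functional setting, quadrature 
approximations of $\langle X_i, B_j \rangle$), a penalty matrix `G` and a 
penalization parameter `rho`. 

```python
import numpy as np

def check_loss(u, alpha):
    """l_alpha(u) = |u| + (2*alpha - 1)*u."""
    return np.abs(u) + (2.0 * alpha - 1.0) * u

def penalized_criterion(theta, A, y, alpha, rho, G):
    """Mean check loss of the residuals plus rho * theta' G theta."""
    return check_loss(y - A @ theta, alpha).mean() + rho * theta @ G @ theta

def irls_quantile(A, y, alpha, rho, G, n_iter=200, eps=1e-6):
    """Iteratively reweighted least squares for the penalized criterion.
    At each step |u_i| is replaced by u_i^2 / (2 d_i) + d_i / 2 with
    d_i = |current residual| + eps, which gives a linear system in theta."""
    n = len(y)
    theta = np.linalg.lstsq(A, y, rcond=None)[0]   # least-squares start
    ones = np.ones(n)
    for _ in range(n_iter):
        w = 1.0 / (np.abs(y - A @ theta) + eps)    # weights 1 / d_i
        lhs = (A.T * w) @ A / n + 2.0 * rho * G
        rhs = (A.T * w) @ y / n + (2.0 * alpha - 1.0) * (A.T @ ones) / n
        theta = np.linalg.solve(lhs, rhs)
    return theta

# Hypothetical check with an intercept-only design and no penalty:
# the fitted constant should be close to the empirical 0.9-quantile.
rng = np.random.default_rng(1)
y = rng.normal(size=500)
A = np.ones((500, 1))
G = np.zeros((1, 1))
theta_hat = irls_quantile(A, y, alpha=0.9, rho=0.0, G=G)
print(theta_hat[0], np.quantile(y, 0.9))
```

The small constant `eps` keeps the weights finite when a residual vanishes; it 
slightly smooths the criterion, so the output should be read as an approximate 
minimizer. 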



5 Proof of theorem 1 

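The proof of the result is based on the same kind of decomposition of 
$\hat\Psi_\alpha - \Psi_\alpha$ as the one used by He and Shi (1994). The main 
difference comes from the fact that our design matrix is ill-conditioned, 
which led us to add the penalization term, treated using some arguments from 
Cardot et al. (2003). Hypothesis (H.2) implies (see de Boor, 1978) that there 
exists a spline function $\Psi_\alpha^* = \mathbf{B}_{k,q}^\top \theta^*$, 
called spline approximation of $\Psi_\alpha$, such that 

\[ \sup_{t \in [0,1]} | \Psi_\alpha^*(t) - \Psi_\alpha(t) | \le \frac{C_2}{k_n^p}. \qquad (8) \]

In what follows, we set $R_i = \langle \Psi_\alpha^* - \Psi_\alpha, X_i \rangle$; 
so we deduce from (8) and from hypothesis (H.1) that there exists a positive 
constant $C_3$ such that 

\[ \max_{i=1,\ldots,n} |R_i| \le \frac{C_3}{k_n^p}, \quad a.s. \qquad (9) \]

The operator $\Gamma_n$ allows to define the empirical version of the $L^2$ 
norm by $\| u \|_n^2 = \langle \Gamma_n u, u \rangle$. At first, we show the 
result (ii) of theorem 1 for the penalized empirical $L^2$ norm. Writing 
$\hat\Psi_\alpha - \Psi_\alpha = (\hat\Psi_\alpha - \Psi_\alpha^*) + (\Psi_\alpha^* - \Psi_\alpha)$, 
we get 

\[
\begin{aligned}
\| \hat\Psi_\alpha - \Psi_\alpha \|_n^2 + \rho \| (\hat\Psi_\alpha - \Psi_\alpha)^{(m)} \|^2
 \le{}& \frac{2}{n} \sum_{i=1}^n \langle \hat\Psi_\alpha - \Psi_\alpha^*, X_i \rangle^2
      + \frac{2}{n} \sum_{i=1}^n \langle \Psi_\alpha^* - \Psi_\alpha, X_i \rangle^2 \\
     &+ 2\rho \| (\hat\Psi_\alpha - \Psi_\alpha^*)^{(m)} \|^2
      + 2\rho \| (\Psi_\alpha^* - \Psi_\alpha)^{(m)} \|^2 .
\end{aligned}
\]

Now, using again hypothesis (H.1), we get almost surely and for all 
$i = 1, \ldots, n$, the inequality 
$\langle \Psi_\alpha^* - \Psi_\alpha, X_i \rangle^2 \le C_0^2 C_2^2 / k_n^{2p}$. 
Moreover, lemma 8 of Stone (1985) gives us the existence of a positive 
constant $C_4$ that satisfies 
$\| (\Psi_\alpha - \Psi_\alpha^*)^{(m)} \|^2 \le C_4 k_n^{2(m-p)}$. So we deduce 

\[
\begin{aligned}
\| \hat\Psi_\alpha - \Psi_\alpha \|_n^2 + \rho \| (\hat\Psi_\alpha - \Psi_\alpha)^{(m)} \|^2
 \le{}& \frac{2}{n} \sum_{i=1}^n \langle \hat\Psi_\alpha - \Psi_\alpha^*, X_i \rangle^2
      + 2\rho \| (\hat\Psi_\alpha - \Psi_\alpha^*)^{(m)} \|^2 \\
     &+ \frac{2 C_0^2 C_2^2}{k_n^{2p}} + 2 C_4 \rho\, k_n^{2(m-p)}, \quad a.s. \qquad (10)
\end{aligned}
\]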
Our goal is now to compare our estimator $\hat\Psi_\alpha$ with the spline 
approximation $\Psi_\alpha^*$. For that, we adopt the following transformation 
$\theta = C_\rho^{-1/2} \beta + \theta^*$. Then, we define on the set 
$\Omega_n$ 

\[ f_i(\beta) = l_\alpha\big( Y_i - \langle \mathbf{B}_{k,q}^\top (C_\rho^{-1/2}\beta + \theta^*), X_i \rangle \big) + \rho\, \| (\mathbf{B}_{k,q}^\top (C_\rho^{-1/2}\beta + \theta^*))^{(m)} \|^2 . \]

We notice that minimizing $\sum_{i=1}^n f_i(\beta)$ comes back to the 
minimization of the criterion (5). We are interested in the behaviour of the 
function $f_i$ around zero: $f_i(0)$ is the value of our loss criterion when 
$\theta = \theta^*$. Let us also notice that the inverse of the matrix 
$C_\rho$ appears in the definition of $f_i$. This inverse exists on the set 
$\Omega_n$ defined by (6), whose probability goes to 1 as $n$ goes to 
infinity. Lemma 1 below, whose proof is given in section 5.1, allows us to 
get the results (i) and (ii) of theorem 1 for the penalized empirical $L^2$ 
norm. 

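Lemma 1 Under the hypotheses of theorem 1, for all $\varepsilon > 0$, there 
exists $L = L_\varepsilon$ (sufficiently large) and 
$(\delta_n)_{n \in \mathbb{N}}$ with 
$\delta_n = \sqrt{1/(n \eta_n) + \rho^2/(k_n \eta_n)}$ such that, for $n$ 
large enough 

\[ P\left[ \inf_{|\beta| = 1} \sum_{i=1}^n f_i(L \delta_n \beta) > \sum_{i=1}^n f_i(0) \right] > 1 - \varepsilon. \]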



Using convexity arguments, this inequality means that the solution 
$\hat\beta$ exists and is unique on the ball centered at $\theta^*$ and of 
radius $L \delta_n$. As we use the one-to-one transformation 
$\theta = C_\rho^{-1/2} \beta + \theta^*$ on the set $\Omega_n$, we deduce the 
existence and the uniqueness of the solution of (5) on the set $\Omega_n$, 
which proves point (i) of theorem 1. 

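Now, let $\varepsilon$ be strictly positive; using the convexity of the 
function $f_i$, there exists $L = L_\varepsilon$ such that, for $n$ large 
enough 

\[ P\left[ \inf_{|\beta| \ge 1} \sum_{i=1}^n f_i(L \delta_n \beta) > \sum_{i=1}^n f_i(0) \right] > 1 - \varepsilon. \qquad (11) \]

On the other hand, using the definition of $f_i$ and the minimization 
criterion (5), we have 

\[ \frac{1}{n} \sum_{i=1}^n f_i\big( C_\rho^{1/2}\hat\theta - C_\rho^{1/2}\theta^* \big) = \inf_{\theta \in \mathbb{R}^{k+q}} \left\{ \frac{1}{n} \sum_{i=1}^n l_\alpha\big( Y_i - \langle \mathbf{B}_{k,q}^\top \theta, X_i \rangle \big) + \rho\, \| (\mathbf{B}_{k,q}^\top \theta)^{(m)} \|^2 \right\}, \]

so we finally get 

\[ \frac{1}{n} \sum_{i=1}^n f_i\big( C_\rho^{1/2}\hat\theta - C_\rho^{1/2}\theta^* \big) \le \frac{1}{n} \sum_{i=1}^n f_i(0). \]

Then, combining this with equation (11) we obtain 

\[ P\left[ \inf_{|\beta| \ge 1} \sum_{i=1}^n f_i(L \delta_n \beta) > \sum_{i=1}^n f_i\big( C_\rho^{1/2}\hat\theta - C_\rho^{1/2}\theta^* \big) \right] > 1 - \varepsilon. \qquad (12) \]

Now, using the definition of $C_\rho$, we have 

\[
\begin{aligned}
& P\left[ \frac{1}{n} \sum_{i=1}^n \langle \hat\Psi_\alpha - \Psi_\alpha^*, X_i \rangle^2 + \rho\, \| (\hat\Psi_\alpha - \Psi_\alpha^*)^{(m)} \|^2 \le L^2 \delta_n^2 \right] \\
&\quad = 1 - P\left[ \big| C_\rho^{1/2} (\hat\theta - \theta^*) \big| > L \delta_n \right] \\
&\quad \ge P\left[ \inf_{|\beta| \ge 1} \sum_{i=1}^n f_i(L \delta_n \beta) > \sum_{i=1}^n f_i\big( C_\rho^{1/2}\hat\theta - C_\rho^{1/2}\theta^* \big) \right].
\end{aligned}
\]

With relation (12), this last probability is greater than $1 - \varepsilon$, 
so we obtain 

\[ \frac{1}{n} \sum_{i=1}^n \langle \hat\Psi_\alpha - \Psi_\alpha^*, X_i \rangle^2 + \rho\, \| (\hat\Psi_\alpha - \Psi_\alpha^*)^{(m)} \|^2 = O_P(\delta_n^2) = O_P\left( \frac{1}{n \eta_n} + \frac{\rho^2}{k_n \eta_n} \right). \]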

This last result, combined with inequality (10), finally gives us the 
equivalent of result (ii) for the penalized empirical $L^2$ norm. Point (ii) 
(with the norm $\|\cdot\|_2$) then follows from lemma 2 below, which is 
proved in section 5.5, and completes the proof of theorem 1 (ii). 

Lemma 2 Let $f$ and $g$ be two functions supposed to be $m$ times 
differentiable and such that 

\[ \| f - g \|_n^2 + \rho\, \| (f - g)^{(m)} \|^2 = O_P(u_n), \]

with $u_n$ going to zero when $n$ goes to infinity. Under hypotheses (H.1) and 
(H.3), and if moreover $\| g \|$ and $\| g^{(m)} \|$ are supposed to be 
bounded, we have 

\[ \| f - g \|_2^2 = O_P(u_n). \]

5.1 Proof of lemma 1 

This proof is based on three preliminary lemmas, proved respectively in 
sections 5.2, 5.3 and 5.4. We denote by $T_n$ the set of the random variables 
$(X_1, \ldots, X_n)$. Under the hypotheses of theorem 1, we have the following 
results. 

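Lemma 3 There exists a constant $C_5$ such that, on the set $\Omega_n$ 
defined by (6), we have 

\[ \max_{i=1,\ldots,n} \big| \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle \big| \le \frac{C_5 |\beta|}{\sqrt{k_n \eta_n}}, \quad a.s. \]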



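Lemma 4 For all $\varepsilon > 0$, there exists $L = L_\varepsilon$ such that 

\[ \lim_{n \to +\infty} P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n \big( f_i(L \delta_n \beta) - f_i(0) - E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ] \big) \right| > \varepsilon \delta_n^2 n \right] = 0. \]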



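Lemma 5 For all $\varepsilon > 0$, there exists $L = L_\varepsilon$ such that 

\[ P\left[ \inf_{|\beta| = 1} \sum_{i=1}^n E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ] > \delta_n^2 n \right] > 1 - \varepsilon. \]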



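These three lemmas allow us to prove lemma 1. Indeed, let $L$ be a strictly 
positive real number; we make the following decomposition 

\[ \inf_{|\beta| = 1} \sum_{i=1}^n \big( f_i(L \delta_n \beta) - f_i(0) \big) \ge A_n + B_n, \]

with 

\[ A_n = \inf_{|\beta| = 1} \sum_{i=1}^n \big( f_i(L \delta_n \beta) - f_i(0) - E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ] \big) \]

and 

\[ B_n = \inf_{|\beta| = 1} \sum_{i=1}^n E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ]. \]

Using lemmas 4 and 5, we can find $L$ sufficiently large such that, for $n$ 
large enough 

\[ P( |A_n| > \delta_n^2 n ) < \varepsilon / 2 \]

and 

\[ P( B_n > \delta_n^2 n ) > 1 - \varepsilon / 2, \]

thus we get 

\[ P\left[ \inf_{|\beta| = 1} \sum_{i=1}^n f_i(L \delta_n \beta) > \sum_{i=1}^n f_i(0) \right] \ge P( A_n + B_n > 0 ) > 1 - \varepsilon, \]

which completes the proof of lemma 1. 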



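5.2 Proof of lemma 3 

Using lemma 6.2 of Cardot et al. (2003), we have 

\[ \lambda_{\min}(C_\rho) \ge C_5' \eta_n + o_P\big( (k_n^2 n^{1-\delta})^{-1/2} \big). \]

Writing $\langle \mathbf{B}_{k,q}, X_i \rangle$ for the vector with 
components $\langle B_j, X_i \rangle$, and noticing that 
$\langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle^2 \le \langle \mathbf{B}_{k,q}, X_i \rangle^\top C_\rho^{-1} \langle \mathbf{B}_{k,q}, X_i \rangle\, |\beta|^2$, 
we deduce, bounding the quadratic form by the inverse of the smallest 
eigenvalue, that 

\[ \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle^2 \le \frac{\langle \mathbf{B}_{k,q}, X_i \rangle^\top \langle \mathbf{B}_{k,q}, X_i \rangle\, |\beta|^2}{\lambda_{\min}(C_\rho)}, \]

which gives us 
$\langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle^2 \le C_5'' |\beta|^2 / (k_n \eta_n) + o_P( n^{(\delta - 1)/2} )$ 
almost surely, and completes the proof of lemma 3. 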



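5.3 Proof of lemma 4 

Considering the definition of functions $f_i$ and $l_\alpha$, we have 

\[
\begin{aligned}
& \sup_{|\beta| \le 1} \left| \sum_{i=1}^n \big( f_i(L \delta_n \beta) - f_i(0) - E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ] \big) \right| \\
&\quad = \sup_{|\beta| \le 1} \Big| \sum_{i=1}^n \Big( \big| \epsilon_i - L \delta_n \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle - R_i \big| - | \epsilon_i - R_i | \\
&\qquad\qquad - E\big[ \big| \epsilon_i - L \delta_n \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle - R_i \big| - | \epsilon_i - R_i | \,\big|\, T_n \big] \Big) \Big| ,
\end{aligned}
\]

where $\epsilon_1, \ldots, \epsilon_n$ are $n$ real random variables, 
independent and identically distributed, defined by 
$\epsilon_i = Y_i - \langle \Psi_\alpha, X_i \rangle$ for all 
$i = 1, \ldots, n$. Let us also denote 
$\Delta_i(\beta) = | \epsilon_i - L \delta_n \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle - R_i | - | \epsilon_i - R_i |$. 
To prove lemma 4, it suffices to show that, for all $\varepsilon > 0$, there 
exists $L = L_\varepsilon$ such that 

\[ \lim_{n \to +\infty} P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) \,|\, T_n ) ] \right| > \varepsilon \delta_n^2 n \right] = 0. \]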



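Let $\varepsilon$ be a strictly positive real number and $\mathcal{C}$ the 
subset of $\mathbb{R}^{k+q}$ defined by 
$\mathcal{C} = \{ \beta \in \mathbb{R}^{k+q} \,/\, |\beta| \le 1 \}$. As 
$\mathcal{C}$ is a compact set, we can cover it with open balls, that is to 
say $\mathcal{C} = \bigcup_{j=1}^{K_n} \mathcal{C}_j$ with $K_n$ chosen such 
that, for all $j$ from 1 to $K_n$, 

\[ \mathrm{diam}(\mathcal{C}_j) \le \frac{\varepsilon \delta_n \sqrt{k_n \eta_n}}{8 C_5 L}. \qquad (13) \]

Hence 

\[ K_n \le \left( \frac{8 C_5 L}{\varepsilon \delta_n \sqrt{k_n \eta_n}} \right)^{k_n + q}. \qquad (14) \]

Now, for $1 \le j \le K_n$, let $\beta_j$ be in $\mathcal{C}_j$; using the 
definition of $\Delta_i(\beta)$ and the triangular inequality, we have 

\[
\begin{aligned}
& \min_{j=1,\ldots,K_n} \sum_{i=1}^n \big| [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] - [ \Delta_i(\beta_j) - E( \Delta_i(\beta_j) | T_n ) ] \big| \\
&\quad \le 2 L \delta_n \min_{j=1,\ldots,K_n} \sum_{i=1}^n \big| \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} (\beta - \beta_j), X_i \rangle \big| .
\end{aligned}
\]

Then, using lemma 3, we get 

\[ \min_{j=1,\ldots,K_n} \sum_{i=1}^n \big| [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] - [ \Delta_i(\beta_j) - E( \Delta_i(\beta_j) | T_n ) ] \big| \le 2 L \delta_n \frac{n C_5}{\sqrt{k_n \eta_n}} \min_{j=1,\ldots,K_n} | \beta - \beta_j |, \]

this last inequality being true only on the set $\Omega_n$ defined by (6). 
Moreover, there exists $j_0 \in \{1, \ldots, K_n\}$ such that 
$\beta \in \mathcal{C}_{j_0}$, which gives us with relation (13) 

\[ \min_{j=1,\ldots,K_n} \sum_{i=1}^n \big| [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] - [ \Delta_i(\beta_j) - E( \Delta_i(\beta_j) | T_n ) ] \big| \le \frac{\varepsilon \delta_n^2 n}{4}. \qquad (15) \]

On the other hand, we have 

\[ \sup_{\beta \in \mathcal{C}} | \Delta_i(\beta) | \le L \delta_n \sup_{\beta \in \mathcal{C}} \big| \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle \big|, \]

and using lemma 3 again, we get, on $\Omega_n$, 

\[ \sup_{\beta \in \mathcal{C}} | \Delta_i(\beta) | \le \frac{C_5 L \delta_n}{\sqrt{k_n \eta_n}}. \qquad (16) \]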

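Besides, for $\beta$ fixed in $\mathcal{C}$, with the same arguments as 
before, if we denote by $T^*$ the set of the random variables 
$(X_1, \ldots, X_n, \ldots)$, we have 

\[ \sum_{i=1}^n \mathrm{Var}( \Delta_i(\beta) \,|\, T^* ) \le \sum_{i=1}^n L^2 \delta_n^2 \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle^2 . \]

Then, using the definition of $C_\rho$, we remark that 

\[ \sum_{i=1}^n \langle \mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta, X_i \rangle^2 = n |\beta|^2 - n \rho\, \beta^\top C_\rho^{-1/2} G_k C_\rho^{-1/2} \beta, \qquad (17) \]

which gives us 

\[ \sum_{i=1}^n \mathrm{Var}( \Delta_i(\beta) \,|\, T^* ) \le n L^2 \delta_n^2 . \qquad (18) \]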

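We are now able to prove lemma 4. Using first relation (15), we have 

\[
\begin{aligned}
& P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \,\cap\, \Omega_n \right] \\
&\quad \le P\left[ \max_{j=1,\ldots,K_n} \left| \sum_{i=1}^n [ \Delta_i(\beta_j) - E( \Delta_i(\beta_j) | T_n ) ] \right| > \frac{\varepsilon}{2} \delta_n^2 n \,\cap\, \Omega_n \right],
\end{aligned}
\]

and then 

\[
\begin{aligned}
& P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \,\cap\, \Omega_n \right] \\
&\quad \le K_n \max_{j=1,\ldots,K_n} P\left[ \left| \sum_{i=1}^n [ \Delta_i(\beta_j) - E( \Delta_i(\beta_j) | T_n ) ] \right| > \frac{\varepsilon}{2} \delta_n^2 n \,\cap\, \Omega_n \right].
\end{aligned}
\]

By inequalities (16) and (18), we apply Bernstein inequality (see Uspensky, 
1937) and inequality (14) to obtain 

\[
\begin{aligned}
& P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \,\cap\, \Omega_n \,\Big|\, T^* \right] \\
&\quad \le 2 \exp\left\{ \ln\left[ \left( \frac{8 C_5 L}{\varepsilon \delta_n \sqrt{k_n \eta_n}} \right)^{k_n + q} \right] - \frac{\varepsilon^2 \delta_n^4 n^2 / 4}{2 n L^2 \delta_n^2 + 2 C_5 L \delta_n \times \varepsilon \delta_n^2 n / (2 \sqrt{k_n \eta_n})} \right\}.
\end{aligned}
\]

This bound does not depend on the sample $T^* = (X_1, \ldots, X_n, \ldots)$; 
hence, if we take the expectation on both sides of the inequality above, we 
deduce 

\[
\begin{aligned}
& P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \,\cap\, \Omega_n \right] \\
&\quad \le 2 \exp\left\{ (k_n + q) \ln\left( \frac{8 C_5 L}{\varepsilon \delta_n \sqrt{k_n \eta_n}} \right) - \frac{\varepsilon^2 \delta_n^2 \sqrt{k_n \eta_n}\, n}{8 L^2 \sqrt{k_n \eta_n} + 4 C_5 L \varepsilon \delta_n} \right\}.
\end{aligned}
\]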



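If we fix $L = L_n = \sqrt{n k_n \eta_n \delta_n^2}$, we have 

\[ \frac{\delta_n^2 \sqrt{k_n \eta_n}\, n}{L^2 \sqrt{k_n \eta_n}} = \frac{1}{k_n \eta_n} \xrightarrow[n \to +\infty]{} +\infty, \]

\[ \frac{\delta_n^2 \sqrt{k_n \eta_n}\, n}{L \delta_n} = \sqrt{n} \xrightarrow[n \to +\infty]{} +\infty, \]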



5l^fk n rj n n 



k n L5 n 



b 2 n ^/k n ri n n ypn, rw+oo 



* 0. 



This leads to 

\[ \lim_{n \to +\infty} P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \,\cap\, \Omega_n \right] = 0, \]

and with the fact that $\Omega_n$ has probability tending to 1 when $n$ goes 
to infinity, we finally obtain 

\[ \lim_{n \to +\infty} P\left[ \sup_{|\beta| \le 1} \left| \sum_{i=1}^n [ \Delta_i(\beta) - E( \Delta_i(\beta) | T_n ) ] \right| > \varepsilon \delta_n^2 n \right] = 0, \]

which completes the proof of lemma 4. 

5.4 Proof of lemma 5 

Let $a$ and $b$ be two real numbers. We denote by $F_{i\epsilon}$ the random 
distribution function of $\epsilon_i$ given $T_n$ and by $f_{i\epsilon}$ the 
random density function of $\epsilon_i$ given $T_n$. As 
$E( l_\alpha(\epsilon_i + b) | T_n ) = \int_{\mathbb{R}} l_\alpha(s + b)\, dF_{i\epsilon}(s)$, 
we obtain, using a Taylor linearization at first order, the existence of a 
quantity $r_{iab}$ such that 

\[ E( l_\alpha(\epsilon_i + a + b) - l_\alpha(\epsilon_i + b) \,|\, T_n ) = f_{i\epsilon}(0) a^2 + 2 f_{i\epsilon}(0) a b + \left( \frac{a^2}{2} + a b \right) r_{iab}, \]

with $r_{iab} \to 0$ when $a, b \to 0$. If we set $L' = \sqrt{2} L$ and 
$R_i' = \sqrt{2} R_i$, this relation gives us 


i=l 
n 

= 2^/ ie (0) [^^(B^c^^A^^' + ^niB^c; 1 / 2 ^,^)^ 

1=1 

n 



ne, (19) 



with — ► 0. Considering (3 such that \(3\ = 1, we have, using relation (jHJ) 



(20) 



Moreover, if we set V n = sup| /3 | =1 max i=1 . n \r if3 \, then with condition (H.4) 
l{y„<mi ni / ie (o)/4} = 1r for n large enough, and 



< 



mini /ie(0) 



< 2min/ ie (0) 



A^ 2 (B^C; 1/2 AX,) 2 + § 



(21) 
Using inequalities (20) and (21), relation (19) becomes then 



n 

[Z a (e, - L5 n (W k Jb- x l*P,Xf) - Ri) - L - Ri) \T n 

8=1 



> 2min/ ie (0) 



5 ^E(biA 1/2 a^) 2 9<r *" 



Now, we come back to the definition of function $f_i$ to obtain 



n 

- inf ^Te [f t (L5 n (3) - fi(0)\T n ] 



> 2 min /*(()) 



5L' 2 
16n 



9C 2 



1=1 

(m) 



2 
ri 

(m) 



Reminding that L = 2L and taking £ = min(| min, /i e (0), 1), we have 
£ > by hypothesis (i?.4) and then 



1 n 

— inf rE^-^ojirj 

t)^n 0=1 -^—f 

" 1=1 



> £L 2 inf 

I0l=i 



i=i 



(m) 



9 
4 



m ^ e(0) ^ + ^ ( K e ^ 1/2/3 ) {m ' ' ( B ^*) (m) >- 

Using relation (TlTl) . we get 



(m) 



j- E E t/i(^n/3) - /i(o)|r n ] 



9 



> ^ -4-^(0)7^ + ^((BI, ? C;^) (m) , (B^) (m) ). 
Moreover, for $|\beta| = 1$, the infimum of 
$\langle (\mathbf{B}_{k,q}^\top C_\rho^{-1/2} \beta)^{(m)}, (\mathbf{B}_{k,q}^\top \theta^*)^{(m)} \rangle$ 
is obtained for $\beta = -C_\rho^{1/2} \theta^* / | C_\rho^{1/2} \theta^* |$. 
Using the fact that the spline approximation has a bounded $m$-th derivative, 
we deduce the existence of a constant $C_9 > 0$ such that 

^((^C-^W, (BI^) (m) ) > 



hence we obtain 



> 



1 - 

*r ,2? E E - /<(°) i T «] 

- | min ^ e (0)-g^ - 2C 9 -^ 



that is to say 



1 n 



Recalling that we have fixed $L = L_n = \sqrt{n k_n \eta_n \delta_n^2}$, we get 
1 , 1 k n rj n 



for 8 n ~ , we have 



nrin L 2 kl p 5l np A k 2 n p *-+°° ' 



f ,2 P i P PV™ n 

for o n ~ , we have — ~ —= > 0. 



This leads to 

\[ \lim_{n \to +\infty} P\left[ \frac{1}{\delta_n^2 n} \inf_{|\beta| = 1} \sum_{i=1}^n E[ f_i(L \delta_n \beta) - f_i(0) \,|\, T_n ] > 1 \right] = 1, \]

which completes the proof of lemma 5. 

5.5 Proof of lemma 2 

Writing $\Gamma_X = (\Gamma_X - \Gamma_n) + \Gamma_n$, we make the following 
decomposition 

\[ \| f - g \|_2^2 \le 2 \| \Gamma_X - \Gamma_n \| \big( \| f \|^2 + \| g \|^2 \big) + \| f - g \|_n^2 . \qquad (22) \]

Now, let us decompose $f$ as follows: $f = P + R$ with 
$P(t) = \sum_{l=0}^{m-1} \frac{t^l}{l!} f^{(l)}(0)$ and 
$R(t) = \int_0^t \frac{(t - u)^{m-1}}{(m-1)!} f^{(m)}(u)\, du$. $P$ belongs 
to the space $\mathcal{P}_{m-1}$ of polynomials of degree at most $m - 1$, 
whose dimension is finite and equal to $m$. Using hypothesis (H.3), there 
exists a constant $C_6 > 0$ such that we have 
$\| P \|^2 \le C_6 \| P \|_n^2$. Then, we can deduce 

\[
\begin{aligned}
\| f \|^2 &\le 2 \| P \|^2 + 2 \| R \|^2 \\
          &\le 2 C_6 \| P \|_n^2 + 2 \| R \|^2 \\
          &\le 4 C_6 \| f \|_n^2 + 4 C_6 \| \Gamma_n \| \, \| R \|^2 + 2 \| R \|^2 . \qquad (23)
\end{aligned}
\]

As $\Gamma_n$ is a bounded operator (by hypothesis (H.1)), there exists a 
constant $C_7 > 0$ such that we have $\| \Gamma_n \| \le C_7$. Moreover, by 
the Cauchy-Schwarz inequality, there exists a constant $C_8 > 0$ such that 
$\| R \|^2 \le C_8 \| f^{(m)} \|^2$. Relation (23) gives 
$\| f \|^2 \le 4 C_6 \| f \|_n^2 + (4 C_6 C_7 + 2) C_8 \| f^{(m)} \|^2$. 
Then, if we write $f = (f - g) + g$, we finally deduce 

\[
\begin{aligned}
\| f \|^2 \le{}& 8 C_6 \| f - g \|_n^2 + (8 C_6 C_7 + 4) C_8 \| (f - g)^{(m)} \|^2 \\
              &+ 8 C_6 \| \Gamma_n \| \, \| g \|^2 + (8 C_6 C_7 + 4) C_8 \| g^{(m)} \|^2 . \qquad (24)
\end{aligned}
\]

We have supposed that $\| g \|$ and $\| g^{(m)} \|$ are bounded, so 

\[ 8 C_6 \| \Gamma_n \| \, \| g \|^2 + (8 C_6 C_7 + 4) C_8 \| g^{(m)} \|^2 = O(1), \]

and the hypothesis 
$\| f - g \|_n^2 + \rho \| (f - g)^{(m)} \|^2 = O_P(u_n)$ gives us the bounds 
$\| f - g \|_n^2 = O_P(u_n)$ and $\| (f - g)^{(m)} \|^2 = O_P(u_n / \rho)$. 
Then, relation (24) becomes 

\[ \| f \|^2 = O_P\left( 1 + \frac{u_n}{\rho} \right). \qquad (25) \]

Finally, we have 
$\| \Gamma_X - \Gamma_n \| = o_P( n^{(\delta - 1)/2} ) = o_P(\rho)$ from 
lemma 5.3 of Cardot et al. (1999). This equality, combined with equations 
(22) and (25), gives us $\| f - g \|_2^2 = O_P(u_n)$, which is the announced 
result. 